Characterization of movie genre based on music score [confusion matrix]


The analysis is performed on 40 tracks from each genre category.

Classification over all four genres was performed with a support vector machine, evaluated by ten-fold cross-validation. In the confusion matrix, a darker shade of grey means that more tracks fall in that cell. The dark-grey diagonal running from the upper left to the lower right shows that, for every genre, most tracks were assigned to their correct genre.
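A minimal Python sketch of how a ten-fold split over 160 tracks (40 per genre) could be laid out. Whether the original folds were stratified by genre is not stated, so the stratification, the track indices and the `stratified_folds` helper are assumptions made for illustration. The classifier itself is left out.

```python
import random

GENRES = ["Action", "Feelgood/Comedy", "Horror", "Romance/Drama"]
TRACKS_PER_GENRE = 40
N_FOLDS = 10


def stratified_folds(labels, n_folds, seed=0):
    """Assign each item index to a fold so every genre is spread evenly."""
    rng = random.Random(seed)
    by_genre = {}
    for index, genre in enumerate(labels):
        by_genre.setdefault(genre, []).append(index)
    folds = [[] for _ in range(n_folds)]
    for indices in by_genre.values():
        rng.shuffle(indices)
        # Deal the shuffled tracks of one genre round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Hypothetical corpus: only the genre label of each track matters here.
labels = [genre for genre in GENRES for _ in range(TRACKS_PER_GENRE)]
folds = stratified_folds(labels, N_FOLDS)

for fold_number, test_indices in enumerate(folds, start=1):
    held_out = set(test_indices)
    train_indices = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fitted on train_indices and scored on test_indices.
    print(fold_number, len(train_indices), len(test_indices))
```

With this layout every fold holds out 16 tracks (4 per genre) and trains on the remaining 144, so each track is used for testing exactly once.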


Precision and recall for all features
class            precision  recall
Action           0.5250000  0.525
Feelgood/Comedy  0.5897436  0.575
Horror           0.6666667  0.700
Romance/Drama    0.6410256  0.625

Precision and recall for timbral features
class            precision  recall
Action           0.5957447  0.700
Feelgood/Comedy  0.5862069  0.425
Horror           0.7567568  0.700
Romance/Drama    0.6382979  0.750

Averaged over the four genres, the timbral features alone score slightly higher than the full feature set (mean recall 0.644 versus 0.606). Feelgood/Comedy is the exception: its recall drops from 0.575 to 0.425. With timbral features only, Feelgood/Comedy tracks are most often confused with Romance/Drama and Action, which suggests that the timbre of feelgood/comedy music is somewhat similar to that of romantic dramas and action movies. The timbral features of Horror and Feelgood/Comedy are the most distant from each other.

NEW Forest model


The forest model points to eight features that best characterize the four movie genres:

Track-level features:
  • Valence
  • Acousticness
  • Danceability
  • Loudness
  • Instrumentalness

Timbral components:
  • c01
  • c06

Key:
  • A

Next, a new analysis is performed on these 8 features only.

NEW Confusion matrix with characterizing components


Results when using only the top 8 features from the previous page, compared with using all features:


Precision and recall for selected features
class            precision  recall
Action           0.5454545  0.450
Feelgood/Comedy  0.6153846  0.600
Horror           0.6428571  0.675
Romance/Drama    0.6086957  0.700

Precision and recall for all features
class            precision  recall
Action           0.5250000  0.525
Feelgood/Comedy  0.5897436  0.575
Horror           0.6666667  0.700
Romance/Drama    0.6410256  0.625
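Because every genre contributes the same number of tracks, the overall accuracy of a model can be recovered from its per-genre recall values. The Python sketch below does this for the two tables above. The recall values are copied from the tables; the 40 tracks per genre come from the description of the analysis; the `overall_accuracy` helper is a hypothetical name.

```python
TRACKS_PER_GENRE = 40  # as stated for this analysis

# Per-genre recall in the order Action, Feelgood/Comedy, Horror, Romance/Drama.
recall = {
    "selected features": [0.450, 0.600, 0.675, 0.700],
    "all features": [0.525, 0.575, 0.700, 0.625],
}


def overall_accuracy(recalls, tracks_per_class):
    """Turn per-class recall into correctly classified counts, then pool them."""
    correct = sum(round(r * tracks_per_class) for r in recalls)
    return correct / (tracks_per_class * len(recalls))


for name, values in recall.items():
    print(f"{name}: {overall_accuracy(values, TRACKS_PER_GENRE):.3f}")
```

Under these assumptions both feature sets classify 97 of the 160 tracks correctly. The difference lies in how the errors are distributed over the genres.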

NEW Decision Tree

Precision and recall from decision tree
class            precision  recall
Action           0.6756757  0.625
Feelgood/Comedy  0.5490196  0.700
Horror           0.7560976  0.775
Romance/Drama    0.7419355  0.575

The decision tree reaches higher precision and recall for most genres than the models behind the confusion matrices. Recall (sensitivity) is the proportion of tracks from a genre that were assigned to that genre. Recall is highest for Horror at 0.775, so 77.5% of the tracks that belong to horror films are correctly categorized. Precision is lowest for Feelgood/Comedy (0.549) and recall is lowest for Romance/Drama (0.575).
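As an illustration of how precision and recall follow from classification counts, here is a short Python sketch. The counts are not given in the source: they are back-calculated from the decision-tree table under the assumption of 40 tracks per genre (for example, a recall of 0.775 corresponds to 31 of 40 tracks). The off-diagonal cells of the confusion matrix cannot be recovered this way, so only the totals are used.

```python
TRACKS_PER_GENRE = 40  # assumed number of true tracks per genre

# Tracks assigned to the correct genre (assumed: recall * 40).
correct = {"Action": 25, "Feelgood/Comedy": 28, "Horror": 31, "Romance/Drama": 23}
# Total tracks the model assigned to each genre (assumed: correct / precision).
predicted = {"Action": 37, "Feelgood/Comedy": 51, "Horror": 41, "Romance/Drama": 31}


def precision_recall(correct, predicted, actual_per_class):
    """Precision = correct / predicted as class; recall = correct / actually in class."""
    return {
        genre: {
            "precision": correct[genre] / predicted[genre],
            "recall": correct[genre] / actual_per_class,
        }
        for genre in correct
    }


scores = precision_recall(correct, predicted, TRACKS_PER_GENRE)
for genre, values in scores.items():
    print(f"{genre:16s} {values['precision']:.7f} {values['recall']:.3f}")
```

Precision answers "of the tracks labeled Horror, how many really are Horror?", while recall answers "of the real Horror tracks, how many were found?".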

NEW Density plot of tempi


Median values for tempo (BPM)
category         mode   median
Action           Major  117.594
Action           Minor   98.890
Feelgood/Comedy  Major  125.056
Feelgood/Comedy  Minor  115.990
Horror           Major   88.599
Horror           Minor   90.337
Romance/Drama    Major  110.088
Romance/Drama    Minor  101.931

Feelgood/comedy tracks have the highest median tempo of the four genres (roughly 116 to 125 BPM) and horror tracks the lowest (around 90 BPM). The tempo distributions for romance/drama and action are about the same. In three of the four genres, tracks in a minor key have a lower median tempo than tracks in a major key; the gap is largest for action (98.9 versus 117.6 BPM). Horror is the exception, with a slightly higher median for minor (90.3 BPM) than for major (88.6 BPM).
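The medians in the table are grouped by genre and mode. A minimal Python sketch of that grouping is shown below. The rows are made-up toy values, not tracks from the corpus, and `median_tempo_by_group` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (category, mode, tempo in BPM) rows; not taken from the corpus.
tracks = [
    ("Horror", "Minor", 84.0), ("Horror", "Minor", 90.0), ("Horror", "Minor", 131.0),
    ("Horror", "Major", 88.0), ("Horror", "Major", 92.0),
    ("Action", "Major", 110.0), ("Action", "Major", 118.0), ("Action", "Major", 170.0),
]


def median_tempo_by_group(rows):
    """Collect tempi per (category, mode) pair and take the median of each group."""
    groups = defaultdict(list)
    for category, mode, tempo in rows:
        groups[(category, mode)].append(tempo)
    return {key: median(values) for key, values in groups.items()}


medians = median_tempo_by_group(tracks)
for (category, mode), value in sorted(medians.items()):
    print(category, mode, value)
```

The median is a reasonable summary here because a single very fast track (such as the 170 BPM toy value) barely moves it, whereas it would pull the mean upwards.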

Histogram of keys


The histogram shows a clear distinction between horror and action on the one hand and romance/drama and feelgood/comedy on the other: the first two genres contain many tracks in the key of C#, whereas the other two contain far fewer.

Horror: G#-key.

The mean differences of timbre coefficients of different movie genres


Timbre, also known as “tone color”, is the perceived sound quality of a musical note or sound; it is what distinguishes different types of musical instruments. Spotify provides twelve timbre coefficients, which are high-level abstractions of the spectral surface, ordered by degree of importance. The first coefficient represents the average loudness, the second indicates brightness, the third is more closely related to the flatness of a sound, and the fourth to sounds with a stronger attack. A sound is called ‘brighter’ when it has more mid- and high-frequency content. High flatness means that the spectrum has a similar amount of power in all spectral bands (similar to white noise); low flatness indicates a “spiky” spectrum (a mixture of sine waves).

When comparing the twelve Spotify timbre coefficients between the four movie genres, the main differences lie in the second and third coefficients. For c02, horror music clearly differs from the other genres: its values lie much further into the positive range. This is probably due to the use of string instruments in horror music.
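The notion of flatness described above can be made concrete with the usual definition of spectral flatness: the geometric mean of the power spectrum divided by its arithmetic mean. The Python sketch below compares white noise with a pure sine wave. It only illustrates the concept; it is not how Spotify derives its timbre coefficients, and the signal length, the seed and the small `eps` offset are arbitrary choices for the example.

```python
import cmath
import math
import random


def power_spectrum(signal):
    """Power of the positive-frequency DFT bins (DC excluded), via a naive DFT."""
    n = len(signal)
    powers = []
    for k in range(1, n // 2):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        powers.append(abs(coefficient) ** 2)
    return powers


def spectral_flatness(signal, eps=1e-12):
    """Geometric mean of the power spectrum divided by its arithmetic mean."""
    # eps keeps the logarithm finite for bins that carry (almost) no power.
    powers = [p + eps for p in power_spectrum(signal)]
    geometric_mean = math.exp(sum(math.log(p) for p in powers) / len(powers))
    arithmetic_mean = sum(powers) / len(powers)
    return geometric_mean / arithmetic_mean


N = 256
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(N)]
sine_wave = [math.sin(2 * math.pi * 8 * t / N) for t in range(N)]

noise_flatness = spectral_flatness(white_noise)
sine_flatness = spectral_flatness(sine_wave)
print(f"white noise: {noise_flatness:.3f}")
print(f"sine wave:   {sine_flatness:.3g}")
```

The noise signal spreads its power over all bins and therefore scores far higher than the sine wave, whose power sits in a single bin and whose flatness is close to zero.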

The differences between an outlier and a typical track from horror film music


Two tracks, Doll Box and Curse Your Name, are compared because they differ strongly in valence while having similar energy. Doll Box is unlike the majority of the horror film music: its valence (0.854) is relatively high for the genre. Curse Your Name is a very typical horror track with an extremely low valence (0.021). The energy values are more or less the same (0.0625 and 0.0725). Doll Box is written in a major key, with very high-pitched, distinctive tones; the chord is G#major(9). The chromagram of Curse Your Name is more smeared out, because many different tones, including very high and very low pitches, are played at the same time throughout the whole track. This track is in a minor key and sounds a bit out of tune. That is very typical for horror film music, in which many tones are played at the same time even if they do not sound harmonically ‘correct’ to the human ear.

The differences of timbral components of an outlier and a typical track in horror film-music [Cepstrograms]


The cepstrogram of Doll Box shows clearly which timbre components dominate this track: c02, c03 and c04. The track has the typical sound of a doll music box. Because these three components are very constant throughout the whole track, it is hard to distinguish the different sound characteristics. Two spots stand out in this cepstrogram: the strongly contrasting parts at t = 1 and at t = 32. These parts are riffs on a copper xylophone.

The timbral components of Curse Your Name are much more spread out in the cepstrogram. This is very typical for horror music, which tends to combine many different musical characteristics and instruments at the same time in order to give the audience an uncomfortable and unsettling feeling. The first bright yellow part of c02 is very distinguishable in this track. It represents a wind instrument, probably a trumpet. Immediately after this part, a somewhat longer c01 section appears. This is a very low-pitched string instrument; it sounds like a string bass. The yellow part of c05 is a very sharp sound, a high-pitched flute that is very unpleasant to the ear. It is clear that very high-pitched and very low-pitched sounds are used at the same time in typical horror music.

Comparing c03 in both tracks: in both it corresponds to a very high-pitched instrument. In Doll Box this sound is made by a percussion instrument, whereas in Curse Your Name it sounds more like a wind instrument.

Self-similarity matrices of a horror outlier [Chroma and Timbre]


The visualisation on the left represents a self-similarity matrix for ‘chroma’. In this graphic, the x and y axes both represent the song ‘Doll Box’, an outlier from the horror film music playlist. It demonstrates at which points in the track the same pitches occur. The right visualisation represents the same song, but with ‘timbre’, also referred to as ‘tone color’.

The self-similarity matrix of timbre is very constant throughout the track, but two sections stand out, at t = 1 and t = 32. These time segments correspond to the contrasting segments in the cepstrogram on the previous page, which means that they represent the same riffs.
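A self-similarity matrix compares every time segment of a track with every other segment of the same track. The Python sketch below builds one from a handful of feature vectors. The cosine distance and the 3-dimensional toy frames are assumptions for illustration (real chroma and timbre vectors have 12 dimensions, and the distance measure used for the plots is not stated).

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def self_similarity(frames):
    """Distance between every pair of segments of the same track."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]


# Hypothetical feature frames, one per time segment.
frames = [
    [1.0, 0.0, 0.0],  # section A
    [1.0, 0.1, 0.0],  # section A, slightly varied
    [0.0, 1.0, 0.0],  # contrasting section B
    [1.0, 0.0, 0.0],  # section A returns
]

matrix = self_similarity(frames)
for row in matrix:
    print(" ".join(f"{value:.3f}" for value in row))
```

Small distances off the diagonal (here between the first and last frame) mark repeated material, while a row of large distances (the third frame) marks a segment that stands out, comparable to the two contrasting sections in the timbre matrix.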

(I couldn’t adapt the ‘bars’ to smaller segments, I tried a lot of combinations but unfortunately it didn’t work. That’s why the matrices are a bit pixelated).

In what way does the film music from different movie genres differ, based on the emotional quadrant? [track-level-features]



This graphic shows the emotional quadrant of tracks played in movies from the genres horror, action, feelgood-comedy and romance/drama. Louder tracks have bigger dots, and color represents the mode of the track. Horror and action stand apart from the other two genres: their tracks mainly have very low valence values. The tracks from horror movies sit in the depressing / sad section of the quadrant, with many tracks in a minor key, and are overall very quiet. Tracks from action movies are more spread out, but lie mainly in the angry / turbulent and depressing / sad sections. Surprisingly, the majority of these tracks do not seem to be very loud at all. Feelgood-comedy tracks are more scattered, but compared with the other genres this genre has many tracks in the happy / joyful section, and these tend to be louder. The tracks of romantic dramas are found throughout the whole plot, but have a low valence overall.
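A minimal Python sketch of how a track could be assigned to a section of the emotional quadrant. Several things are assumed here: that the vertical axis is energy, that both axes are split at 0.5, and that the fourth section is labeled "calm / peaceful" (the text only names the other three). The two example tracks use the valence and energy values reported for Doll Box and Curse Your Name.

```python
def emotional_quadrant(valence, energy, threshold=0.5):
    """Map a (valence, energy) pair to one of four quadrant labels."""
    if valence >= threshold:
        # "calm / peaceful" is an assumed label for high valence, low energy.
        return "happy / joyful" if energy >= threshold else "calm / peaceful"
    return "angry / turbulent" if energy >= threshold else "depressing / sad"


examples = {
    "Doll Box": (0.854, 0.0625),
    "Curse Your Name": (0.021, 0.0725),
}

for title, (valence, energy) in examples.items():
    print(title, "->", emotional_quadrant(valence, energy))
```

Under these assumptions the typical horror track lands in the depressing / sad section, while the outlier ends up in the high-valence, low-energy section.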

Can we predict movie genre based on just the film music?

The full emotional effect of a movie is mainly based on the music played, in combination with the visual information. Typical movie genres are romance, comedy, horror and action. Which musical features carry the most information about these genres? Emotions are key elements in all of them. In romantic dramas, emotions like ‘loving’ and a ‘sense of longing’ are the main characteristics, but ‘sadness’ sometimes also plays a role. In horror movies, fear and anxiety are the main emotions expressed in the music, and dark overtones will be present. In action movies, emotions with high intensity, like excitement, are essential. In feelgood-comedy movies, ‘happiness’ and ‘joy’ are the main emotions, so these tracks will contain many musical elements of ‘happy’ music, like major tones.

The corpus covers a representative selection of typical movies within the four movie genres. The selection is based on the genre categorization of the Internet Movie Database (IMDb). Movies within an IMDb genre that have typical features of the other genres were excluded, as were ‘dialogue’ tracks from the Spotify albums. The romantic-drama playlist contains 331 tracks (11 movies), the feelgood-comedy playlist contains 235 tracks (16 movies), the horror playlist has 237 tracks (9 movies) and the action playlist has a total of 223 tracks (12 movies). Only Spotify albums labeled ‘Official Motion Picture Soundtrack’ were selected.

What are the expectations?

A horror movie is defined here as a movie that seeks to scare or unsettle the audience. I expect that this music is mostly written in a minor key. Music in horror movies wobbles and sounds deliberately out of tune, for example through many glissandi on violins (the screeching upward slides). Pitch is destabilized, and pitch drops are used to stress the ‘unexpected’. One of the most iconic sounds is the sudden sforzando tutti crash, designed to shock the audience instantly. It often occurs in the midst of a musical silence or after a pedal note.

Key features:
  • Atonality
  • Pitch destabilization (‘deformed’ sounds)
  • Spiky motif
  • sforzando tutti and glissando
  • Sudden silence
  • Low valence

Romantic dramas generally contain both ‘loving feelings’ and ‘sadness’, which is why this genre will probably oscillate between music written in the major and the minor key.

Key features:
  • Slow pace
  • Light tones
  • Longline and lyrical melodies
  • Medium valence: equal major and minor
  • Timbre: woodwind instruments, piano and harp

The action movie offers thrills (e.g. shooting) and spectacle (e.g. explosions).

Key features:
  • Fast tempo
  • High staccato
  • High pitch repetition
  • Timbre: brass and percussion instruments
  • Loudness: timbre component 1

Comedy movies are overall very happy. Happy tunes are written in the major key, are louder than music from other genres and are probably more danceable, with a high valence.

Key features:
  • Constant major
  • High loudness and energy
  • High pitch repetition
  • Timbre: Piano, string instruments, few harmonics
  • Loud songs
  • High valence

The corpus

Action
  • Atomic Blonde • Edge of Tomorrow • Fast Five/Furious 7 • Hanna • John Wick(s) • Kingsman: The Secret Service • Mad Max: Fury Road • Mission Impossible: Fallout • Avengers: End Game • Tenet • X-Men • Inception
Romance-Drama
  • Call Me by Your Name • Before Midnight • Pride & Prejudice • The Notebook • The Fault in Our Stars • Atonement • The Theory of Everything • The Age of Adaline • Brokeback Mountain • The Guernsey Literary • After We Collided • Little Women
Horror
  • Get Out • The Lighthouse • A Quiet Place • The Conjuring • Upgrade • Split • Hereditary • Doctor Sleep • IT
Feelgood-Comedy
  • The Nice Guys • This Is the End • They Came Together • Bridesmaids • Hunt for the Wilderpeople • The 40 Year Old Virgin • The Big Sick • Scott Pilgrim vs the World • 21 Jump Street • The Way, Way Back • Girls Trip • Crazy Rich Asians • American Hustle • Pitch Perfect • Wet Hot American Summer • Napoleon Dynamite

Conclusion and Discussion

The results support the notion that high-intensity movies, like action and horror, have musical cues that are measurably different from the scores of movies with a more measured expression of emotion, like comedy and romance.